ANALYSIS OF DATA SYSTEMS REQUIREMENTS
FOR GLOBAL CROP PRODUCTION FORECASTING 
IN THE 1985 TIME FRAME 

INTRODUCTION 


The Outlook for Space [1] defines several objectives for the application
of space technology to desirable and practical activities for the 1980 to 2000
time frame. One of the objectives set forth is a Global Crop Production
Forecast, objective 011 [1]. Global Crop Production Forecasting was selected as
one of the objectives to be analyzed because preliminary analyses indicated that
it would be a major driver of the data system in the time frame under
consideration.


The objective was analyzed, and potential users were interviewed and 
surveyed. Then the objective was revised and quantified. Projected information 
requirements were obtained from the user community; the impact of these 
requirements on a conceptual data system was analyzed; potential problem areas 
were identified; recommendations were made to overcome these problems; and 
future work in specific areas that need more in-depth analysis was identified.


OBJECTIVE FROM OUTLOOK FOR SPACE 


The objective of global crop production forecasting is to provide a 
biweekly forecast of the global production of major crops having worldwide 
food and/or economic significance. There is a need for such a system because
the increasing world population will require an increasing production of food. 

Approximately 98 percent of the world's food comes from the land and
approximately 2 percent comes from the sea. The best conceivable management
can do no more than double the amount of food from the sea [2]; therefore, the
increase in available food must come from the land. Most of the world's food
comes from grains such as wheat, rice, and corn. The world food reserve
has shrunk from 26 percent of annual consumption in 1959 to 7
percent in 1974. North America is the only major exporting region in the world,
and food exports are a major factor in U.S. world trade and balance of
payments [1]. Better global crop production forecasting could provide better
information concerning impending crop failures and the resulting food shortages
and support better decisions on the transportation and distribution of the
available food. A global crop production forecasting system must be able to
predict accurately the production of the important food and fiber crops if it is
to be useful to the "food managers" of the world.

An accurate global crop production system offers a variety of potential 
benefits. Aside from the humanitarian benefits, there are benefits of national 
policy and economic benefits. Since North America is the only major exporting 
region of the world, earlier and better information about world crops could help 
the U.S. and other countries better manage their agricultural production and
minimize fluctuation in price and trade volumes. Grain exports could be better 
planned with less disruption of domestic markets. Decisions on planting, 
marketing, and transportation requirements could be improved. To provide 
these benefits, the forecasting system must be timely and accurate. 


ASSUMED USERS 


There are numerous users of a global crop forecasting system in the 
public and private sectors. These users vary from large government agencies 
and private marketing organizations to individual scientists engaged in research. 
The literature was reviewed, potential users were interviewed, and installations
were surveyed to determine which users would reap the greatest benefit from a
global crop forecasting system. To put realistic bounds on the overall objective,
two primary users, the Agency for International Development (AID) of the U.S.
State Department and the Foreign Agricultural Service (FAS) of the U.S.
Department of Agriculture (USDA), were selected. These are the two agencies
of the U.S. Government most directly involved in policy-making
decisions concerning the export of U.S. agricultural products. The objective
was redefined and the conceptual data system was defined to meet the
requirements of these two users.


QUANTIFIED STATEMENT OF OBJECTIVE


The objective as stated in Reference 1 was redefined and quantified, based 
on our concept of the requirements of AID and FAS. The revised objective is 
as follows: 

To provide a biweekly forecast of the global production of major 
crops having worldwide nutritional and/or economic significance. 

The primary goals were to maintain and strengthen the U.S.
balance of trade, to support U.S. foreign policy decisions, and
to assist in alleviating world famine. The forecast
would cover seven principal crops that are important to
U.S. trade and would cover the principal producing countries and
regions of the world. A long range (9 month) and a short range
(3 months before harvest) forecast would be provided on a biweekly
basis within a one week time frame from data gathering to user.

The long range forecast would be 95 percent accurate with a 90
percent probability, and the short range forecast would be 98
percent accurate with a 90 percent probability.

At the present time, the data available on crop production throughout the
world vary widely. The U.S. has the benefit of a highly sophisticated and usually
very accurate crop production forecasting system provided by the Statistical
Reporting Service (SRS) of the USDA. Other major food producing countries
with sophisticated systems include Canada, Australia, the United Kingdom, and
the USSR; however, the data from these countries may not always be available to
the decision makers in AID and FAS. Less sophisticated systems are used in
most of the Western European countries, parts of Central and South America,
India, and some parts of Africa. Data in the rest of the world are obtained
from very simple systems or are nonexistent. The type and location of the crop
production forecast systems throughout the world are shown in Figure 1 [3].

A global crop production forecasting system which would provide informa- 
tion having the accuracy and timeliness stated in the revised objective would 
provide better information than that now available and would facilitate better 
decisions relating to the production and exporting of food and fiber. 


SCOPE OF OBJECTIVE


The crop production forecast is influenced by the area in cultivation for
a particular crop (usually stated in hectares or acres) and the yield of the crop
per unit area (usually in quintals/hectare or bushels/acre). The production
is computed by the formula:


Area x Yield = Production                                        (1)


The determination of area and yield is very complex, with many factors and
variables entering into the calculations.

The area in equation (1) will be examined first. Before the area in a
certain crop can be measured, the crop must first be identified. Crop identification
presents no problem on the ground, but it is often very difficult to do from
orbital altitudes. In the past it had been thought that specific plants would have
individual spectral signatures that would permit positive identification. There is 
evidence from some research that some plants do possess unique differences in 
their spectral signature in certain narrow bands. However, these subtle differ- 
ences do not show up in the multispectral data obtained by the MSS from Landsat. 
Therefore, other methods must be used to identify and classify the different 
crops. One method to accomplish this is to utilize multitemporal data, or data 
obtained at several different times during the growing season. The signature 
from the different crops can be compared several times (usually a minimum 
of three comparisons) and, with the aid of a crop calendar, the crop can be 
identified. This is illustrated in Figure 2. It is obvious that the signatures for 
spring wheat, winter rye, and buckwheat are very similar. In mid-August it 
would be impossible to distinguish between spring wheat and buckwheat because 
the reflectance is essentially the same. However, if these two crops were 
examined in early August, a significant difference would be noticed. At that
time, the spring wheat has a high reflectance while the buckwheat has a low 
reflectance. There is also a significant difference in early September. In that 
case the buckwheat has a high reflectance while the spring wheat has a low 
reflectance. This is a very simplified case, but illustrates the principle that 
can be used to differentiate crops which have similar signatures. The use of 
this method increases several fold the quantity of data that must be analyzed, 
and places stringent registration requirements on the data since the same 
samples must be analyzed each time. 
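
The multitemporal principle described above can be sketched in a few lines of Python. This is only an illustration: the reflectance values are hypothetical numbers chosen to mirror the qualitative description of spring wheat and buckwheat (high or low in early August, equal in mid-August, reversed in early September), and the function names and the `min_gap` threshold are assumptions, not part of the original study.

```python
# Hypothetical relative reflectance values (0 = low, 1 = high) chosen only to
# mirror the qualitative description in the text; they are not measured data.
SIGNATURES = {
    "spring wheat": {"early Aug": 0.8, "mid Aug": 0.5, "early Sep": 0.2},
    "buckwheat": {"early Aug": 0.2, "mid Aug": 0.5, "early Sep": 0.8},
}


def separable_dates(crop_a, crop_b, signatures, min_gap=0.3):
    """Return the acquisition dates on which two crops differ by at least min_gap."""
    a, b = signatures[crop_a], signatures[crop_b]
    return [date for date in a if abs(a[date] - b[date]) >= min_gap]


def classify(observed, signatures):
    """Pick the crop whose multitemporal signature is closest (least squares)."""
    def distance(crop):
        return sum((observed[d] - signatures[crop][d]) ** 2 for d in observed)
    return min(signatures, key=distance)


dates = separable_dates("spring wheat", "buckwheat", SIGNATURES)
label = classify({"early Aug": 0.75, "mid Aug": 0.5, "early Sep": 0.25}, SIGNATURES)
print(dates, label)
```

A single mid-August observation cannot separate the two crops in this toy data, which is why several registered observations of the same sample are needed.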

[Figure 2: axis label REFLECTANCE; remainder of figure not recoverable]

After a crop is identified, its areal extent must be determined. One
method to accomplish this is to measure all the areas in cultivation, that is, to
perform a wall-to-wall inventory. While this would be desirable, it is generally
agreed that it would require a prohibitive quantity of data to be processed.
Another method is to use sampling techniques in which sample segments are
obtained throughout the agricultural regions, so that much less data must be
obtained and processed. The sampling method is used by the SRS for crop
production forecasting for the U.S., and it is also used in the Large Area Crop
Inventory Experiment (LACIE). It is generally agreed that sampling must be
used for a global crop production forecasting system, and sampling is utilized
in the analysis of this objective.
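
As a rough illustration of why sampling reduces the data load, the sketch below expands the mean crop fraction of a few sample segments to a whole region and attaches a standard error. It assumes equal-sized segments drawn as a simple random sample; it is not the SRS or LACIE estimator, and the region size and segment fractions are hypothetical.

```python
import math


def estimate_crop_area(region_area_km2, segment_crop_fractions):
    """Expand the mean crop fraction of sample segments to the whole region.

    Returns (estimated crop area in km2, standard error in km2), treating the
    segments as a simple random sample of equal-sized units (an assumption).
    """
    n = len(segment_crop_fractions)
    if n < 2:
        raise ValueError("need at least two sample segments")
    mean = sum(segment_crop_fractions) / n
    variance = sum((f - mean) ** 2 for f in segment_crop_fractions) / (n - 1)
    standard_error = math.sqrt(variance / n)
    return region_area_km2 * mean, region_area_km2 * standard_error


# Hypothetical region and segment fractions, for illustration only.
area, se = estimate_crop_area(100_000.0, [0.10, 0.20, 0.30, 0.20])
print(f"estimated area: {area:.0f} km2, standard error: {se:.0f} km2")
```

Adding more segments shrinks the standard error roughly with the square root of the sample size, which is the trade between accuracy and data volume that a sampling design has to settle.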

The second part of equation (1) is yield. Most of the current and
projected crop yield models utilize meteorological and historical data together 
with data on the current crop conditions to predict yield. Remotely sensed 
data can be used as an input to some of the models to supply information on 
crop condition and meteorological conditions, but these models rely heavily on 
ancillary data. When soil moisture information is available from satellites, it 
can be used as an input to the yield models. 

The scope of the objective requires a global crop production forecast. 
Area and yield must be determined for the crop of interest for each region or 
country. These are then combined to obtain the production forecast for that
particular region or country. All these factors are then summed to obtain a 
global crop production forecast. It is evident that this will require a large 
amount of data to be obtained and processed in a short period of time if the 
objective is to be met. 
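
The roll-up described above, area times yield per region summed to a global figure, can be written as a minimal Python sketch. The region names, areas, and yields are placeholders rather than forecast data, and the function names are illustrative.

```python
def production(area_ha, yield_q_per_ha):
    """Area x Yield = Production (quintals, given hectares and quintals/hectare)."""
    return area_ha * yield_q_per_ha


def global_production(regions):
    """Sum the area-times-yield product over every region or country for one crop."""
    return sum(production(r["area_ha"], r["yield_q_per_ha"]) for r in regions)


# Hypothetical regions; the numbers are placeholders, not forecast data.
wheat_regions = [
    {"name": "Region A", "area_ha": 1_000_000, "yield_q_per_ha": 20.0},
    {"name": "Region B", "area_ha": 500_000, "yield_q_per_ha": 30.0},
]

total_quintals = global_production(wheat_regions)
total_tonnes = total_quintals / 10  # 1 quintal = 100 kg, so 10 quintals = 1 tonne
print(total_quintals, total_tonnes)
```

The same sum would be repeated for each crop, so the number of area and yield determinations grows with the product of crops and regions.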


INFORMATION REQUIREMENTS 


The determination of area and yield in a crop forecast requires that many 
physical parameters be identified and measured. To determine which parameters 
are required, the literature was reviewed and many personal interviews were 
conducted with leading scientists in the fields of agriculture, weather and climate, 
and remote sensing. This was done by MSFC personnel and by New Technology,
Inc. under contract NAS8-31423 [4]. A list of the key individuals contacted along
with their organization and our assessment of their role in influencing the concept
of the data system is given in the Appendix. Much valuable information on user
requirements and their influence on the data system was obtained by participating
in the Crop Spectra Workshop held in Sterling, Virginia on February 2-3, 1977
[5]. Many key scientists involved in the acquisition and use of remotely sensed
data for agricultural purposes were participants in the workshop. The reader is
referred to the proceedings of the workshop for the specific details of the
workshop and a list of attendees. Special attention and consideration were given to
LACIE and the personnel associated with the experiment [6]. As expected,
agreement was not unanimous among all these experts as to what parameters
should be measured and what influence these parameters would have on the crop 
forecast and hence on the data system requirements. The parameters selected 
to be included in the conceptual data system are based on what is considered to 
be the consensus of the key individuals contacted. Some of these parameters 
may be wholly or partially obtained by remotely sensed data from space while 
in-situ measurements, statistical data, and historical data must also be used to 
obtain others. The parameters to be determined are as follows: 

a. Episodic Events 

b. Availability of Irrigation Water 

c. Potential Productivity 

d. Soil Moisture 

e. Temperature 

f. Photoperiod 

g. Precipitation 

h. Wind 

i. Disease Epidemics 

j. Insect Infestations 

k. Soil Surface Conditions 

l. Plant Density 

m. Soil Fertility. 


The objective of obtaining a global crop production forecast for seven crops
was broken into first and second level subobjectives and contributing elements.

The seven crops selected were wheat, rice, corn, sugar, soybeans, cotton,
and small grains (other than wheat and rice). These crops were selected because
they are important food or economic crops that are traded in the world market
and in whose trade the U.S. has an appreciable interest or influence.

The four first level subobjectives influencing the forecast were defined 
as crop survey, crop condition, weather and climate information, and crop 
projection. Area is determined by the crop survey, and yield is determined 
or influenced by the other three first level subobjectives. A block diagram of 
these subobjectives is shown in Figure 3. Each of these will be broken down 
into second level subobjectives and contributing elements. 

The main output of the crop survey subobjective is quantitative
information on the area in cultivation and to be harvested of the seven crops during the
growing period. This results in the establishment of two second level
subobjectives: surveillance of included crops and statistical data. The surveillance
consists of the identification of the crops and the mensuration of the area in
cultivation. Scientists in the Agricultural Research Service (ARS) of the USDA
indicate that it would be desirable to have resolution on the order of 20 m for
remotely sensed data obtained by satellites. Statistical data on crop calendars
and cultivation methods will be used as ancillary data in determining the area in
cultivation for each crop and the area expected to be harvested. These data will
come from the SRS for the U.S. but must be obtained from the FAS or the Food
and Agriculture Organization (FAO) of the United Nations for foreign countries.
A data base covering a sufficient length of time (approximately 10 years) is
generally not available for countries other than the U.S. The crop survey
subobjectives are illustrated in block diagram form in Figure 4.

The next first level subobjective to be addressed is crop condition. In
the literature this is sometimes referred to as crop stress or crop vigor. Crop
stress and vigor together with episodic events influence crop condition. The
main output of this subobjective is the type of condition, the intensity of the
condition, the areal extent of the condition, the duration of the condition, and its
concentration and amount. The four primary second level subobjectives
supporting the crop condition subobjective are biological stresses, meteorological
stresses, soil stresses, and artificially induced stresses. A uniform input
called the "crop nominal profile" feeds into each of these second level
subobjectives. The crop nominal profile, or CNP, is a standard set of values for the
nominal growth of each crop under consideration on a regional basis.

Figure 3. First level subobjectives

Figure 4. Crop survey subobjective structure

The major contributing elements or measurable phenomena under 
biological stresses are disease, insects, weeds, and wildlife. Under meteoro- 
logical stresses, the measurable phenomena are precipitation, air pollution, 
humidity, insolation, wind, and drought index. The second level subobjective
soil stress is supported by the measurable phenomena soil moisture, soil
chemistry, and soil temperature. There seems to be some likelihood that soil
moisture measurements could be obtained by 1985 from satellite radar. The
condition of irrigated land and the monitoring of newly added irrigated land are
included as one of the measurable phenomena affecting soil stresses. Artificially
induced stresses include over-fertilization and over-irrigation. These
subobjectives and contributing elements are shown in block diagram form in Figure 5.

The data from satellites would be obtained from observation of sample segments.
These samples could be selected either randomly or systematically.
Observations at 30-day intervals would be sufficient for determining the
crop conditions and episodic events. Repeated observations of the same sample
segment would not be required for determining crop condition.

The output of the weather and climate subobjective is moisture/plant/day,
maximum and minimum temperature per day, sunlight/day, and the
severe storm index. The second level subobjectives supporting weather and
climate information are weather monitoring and statistics, soil water availability, 
long range weather and climate forecast, and short term weather forecast. 

The contributing elements or measurable phenomena which support the 
weather monitoring and statistics subobjective are temperature, wind, sunlight, 
humidity, precipitation, and statistical data. These statistical data consist of 
weather profiles from past years and are obtained from the National Climatic
Center at Asheville, NC.

The contributing elements or measurable phenomena supporting soil 
water availability are soil temperature, soil type and structure, soil moisture 
data, useful reserves, and irrigation methods. The Heat Capacity Mapping 
Mission Satellite (HCMM) may be utilized in making a soil moisture determina- 
tion of sufficient fidelity to support this contributing element. 

The long range weather and climate forecast second level subobjective 
consists of the following measurable phenomena: temperature (maximum and
minimum per day), winds, sunlight or cloud cover, humidity, precipitation,
and statistical data. 

Figure 5. Crop condition subobjective structure

The short-term weather forecast includes the contributing elements or
measurable phenomena of temperature (maximum and minimum per day), winds,
sunlight or cloud cover, humidity, precipitation, and severe storm phenomena. 
Figure 6 presents a block diagram showing the weather and climate subobjective 
structure. 

The last first-level subobjective to be addressed is crop projection.
Crop projection provides the projected acreage of the seven major crops and
the projected yields for individual crops. This provides the long range forecast,
in contrast to the short range forecast as determined by the crop
survey, crop condition, and weather and climate subobjectives. The crop
projection subobjective has as its four primary second level subobjectives
crop calendar, cropping practices, agri-chemical applications, and current
remotely sensed data.

The measurable phenomena supporting crop calendar are growth stage, 
percent ground cover, plant height, and stand quality rating. 

The cropping practices second level subobjective consists of the following 
measurable phenomena or contributing elements: crop rotation practices,
tillage practices, and irrigation practices. 

Under the second level subobjective agri-chemical applications, the 
following contributing elements are found: growth enhancers, physiological
stress inhibitors, and biological stress inhibitors. 

These three second level subobjectives (crop calendar, cropping 
practices, and agri-chemical applications) consist essentially of statistical 
data. 


The fourth second level subobjective under crop projections is remotely
sensed data (current). Contributing elements supporting remotely sensed data 
are identification, mensuration, condition, and weather and climate parameters. 
The crop projections first level subobjective structure is shown in block diagram 
form in Figure 7. 

Throughout the examination of these subobjectives, the role of ancillary 
data was noted. It should be emphasized that ancillary data are essential in 
addition to remotely sensed data in the design and implementation of a data 
management system. 

Figure 6. Weather and climate information subobjective structure

Figure 7. Crop projection subobjective structure


FUNCTIONAL BLOCK DIAGRAM 


A concept of a functional diagram for a global crop production forecasting
system is shown in Figure 8. The multispectral data obtained from a thematic
mapper (TM) and/or multispectral scanner (MSS) located on a satellite are
utilized in crop identification and mensuration. In f1 these data are selected
for the desired cloud cover, geometrically and radiometrically corrected, and
corrected for atmospheric effects. In f2 the crops are identified by
using multispectral and multitemporal data together with ancillary data such
as crop calendar and historic data. Three observations of the same sample
segments during the growing season, at specified times determined by the
phenologic differences between the principal and main confuser crops, will be
needed. To have the
required number of samples at the end of the growing season, a much larger
number of observations of the samples must be obtained at the beginning of the
growing season to compensate for those samples lost due to cloud cover.
Mensuration will be performed in f3 using a sampling system of the type used by
the SRS. The mensuration and identification data will be combined in f4 to give the
area. Satellite multispectral data and ancillary data on crop condition, episodic
events, and soil moisture will be input to the plant condition model at f5 to
determine the crop condition. Satellite meteorological data, in-situ meteorological
data, and historical meteorological data are input to the meteorological model at f6.
Hydrologic data from spacecraft, in-situ measurements, and historical data
are input to the hydrologic model at f7. From the hydrologic model, water
availability will be calculated in f8. The agromet model, f9, combines the outputs from
the meteorological model, water availability and plant condition models, and
ancillary data such as soil nutrients, soil temperature, and fertilizer
applications. The output of this model results in yield at f10. The flow previously
described must be repeated for each different crop, region or country, and
different growing condition. All these repetitions are summed in f11 to obtain
the production forecast. The resulting forecast report is generated at f12 and
distributed to the users.
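
One way to make the functional flow concrete is to encode it as a dependency graph and let a topological sort produce a valid processing order. The sketch below does this with Python's standard `graphlib` module. The stage names are illustrative labels for the functions described in the text. Two links are assumptions, since the diagram itself is not reproduced here: mensuration is taken to consume the preprocessed imagery, and the production stage is taken to consume both area and yield, following equation (1).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each stage maps to the set of stages whose outputs it consumes.
# Stage names are illustrative; external inputs (satellite, in-situ,
# historical, and ancillary data) are not modeled as stages.
DEPENDENCIES = {
    "preprocess_imagery": set(),
    "identify_crops": {"preprocess_imagery"},
    "mensuration": {"preprocess_imagery"},  # assumed link
    "area": {"identify_crops", "mensuration"},
    "plant_condition_model": set(),
    "meteorological_model": set(),
    "hydrologic_model": set(),
    "water_availability": {"hydrologic_model"},
    "agromet_model": {
        "meteorological_model",
        "water_availability",
        "plant_condition_model",
    },
    "yield": {"agromet_model"},
    "production": {"area", "yield"},  # assumed link, per Area x Yield
    "forecast_report": {"production"},
}


def execution_order(dependencies):
    """Return one valid order in which the stages can run."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(DEPENDENCIES)
print(" -> ".join(order))
```

The area branch and the yield branch share no stages until production, so in principle they could be processed in parallel for each crop and region.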


DATA SYSTEMS REQUIREMENTS 


Figure 8. Functional diagram, global crop production forecasting.

The global crop production forecast will require a large amount of data
to be acquired and processed. The amount of data will be much greater than is
now required for the LACIE. The LACIE is limited to wheat, and the regions
where most of the wheat is grown are more favorable for obtaining remotely
sensed data than those of most other crops. Wheat has a long growing season
and is generally grown in areas with less than 50.8 cm of rainfall annually and
hence less cloud cover. At the other extreme, rice has a short growing
season of approximately 30 days and is generally grown in areas of high rainfall
and high humidity, which decreases the probability of obtaining good remotely
sensed data. A trade study was performed to determine the effects of the many
variables on the quantity of data that would be acquired and processed.
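
The effect of cloud cover on the number of samples can be illustrated with a deliberately simple model. The sketch assumes that each sample segment gets exactly one acquisition attempt at each required observation date and that each attempt is cloud free with the same independent probability. Both are assumptions made here for illustration; the probability values are hypothetical and the function name is not from the study.

```python
import math


def initial_segments_needed(required_segments, clear_probability, observations=3):
    """Segments to start with so that enough survive every observation date.

    A segment survives only if all of its `observations` acquisitions are
    cloud free, each with independent probability `clear_probability`.
    """
    if not 0.0 < clear_probability <= 1.0:
        raise ValueError("clear_probability must be in (0, 1]")
    survival = clear_probability ** observations
    return math.ceil(required_segments / survival)


# Hypothetical clear-sky probabilities for a drier and a cloudier region.
print(initial_segments_needed(1000, 0.8))  # drier, wheat-like region
print(initial_segments_needed(1000, 0.5))  # cloudier, rice-like region
```

Under this model the starting sample grows with the inverse cube of the clear-sky probability, so cloudy regions dominate the acquisition load.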


Global Crop Production Forecasting Trade Study 

To place the goals of the Global Crop Production Forecasting Trade Study 
in determining data system requirements in proper perspective, it may be help- 
ful at this time to briefly review what has been discussed so far. This report 
began with the rather general objective to provide a biweekly forecast of the
global production of major crops having worldwide nutritional and/or economic
significance. To make such a broad objective amenable to detailed analysis, it
was scoped and quantified in terms of relevant crops, producing regions, pri- 
mary users of the forecasting information (AID and FAS), achievable forecasting 
accuracies, and reporting frequencies. Some of the many variables associated 
with computing crop areas and yields were discussed together with the diverse 
information sources needed to make these computations. A functional block 
diagram for a global crop production forecasting system was described, showing 
the sequence of steps leading to a forecast report. Lying at the heart of all of 
this, and crucial to the production of any forecast report, is an end-to-end 
processing system with the capability to gather and manipulate the enormous 
quantities of remotely sensed image data entailed by the global nature of this 
objective. As a first step toward defining concrete data systems requirements,
the Global Crop Production Forecasting Trade Study set for itself two
principal goals: (1) determine the data processing load for an operational global
crop production forecasting system as a function of data frequency, crop types,
their biophases, cloud coverage, and number of satellites; and (2) in case the
data load exceeded projected processing capabilities, investigate and propose
alternate strategies (e.g., editing and sampling) to reduce the load while still
achieving the forecast accuracy given in the revised objective. Considering the
complexities and unknowns involved in attaining these goals, certain basic and, 
in some instances, simplifying assumptions had to be made to establish reason- 
able bounds within which the analysis could proceed. This being the case, it 
naturally follows that for any interpretation of the trade study results to be 
valid, reference must be made to this basic framework of underlying assumptions. 


Baseline Trade Study Ground Rules

In establishing a realistic mission baseline, two constraints were placed
on the study. First, it was assumed that the system would be operational in the
1985 time frame, far enough into the future to place it beyond currently planned
programs, but close enough that related technological development could be
predicted with some certainty. The study was placed within the context of what is
likely in 1985 given normal progress, not what is possible, so that the study
goals could be pursued without the necessity of assuming significant technological
breakthroughs. This meant that any data system components needed for a
projected 1985 operational system were already available, or at least under
laboratory development. Furthermore, this involved a pragmatic approach in
harmony with currently known and planned capabilities. A conceptual data
system to perform global crop production forecasting could have been defined
which was divorced from programmatic and cost considerations, but it would
have had little chance of being implemented, no matter what its technological
sophistication.

Second, only the "front end" of the total end-to-end data system, that is, the
data acquisition and ground preprocessing for radiometric, geometric, and format
adjustments, was studied in detail. A system capable of processing 70 full scenes
per day, similar to the currently planned GSFC Landsat-D ground preprocessing
system (50 full and 50 partial Thematic Mapper scenes per day), was assumed
as a baseline with which to compare the operational data load for a global
system. Accuracy of classification and yield prediction capability were presumed
to be adequate. Ancillary data and data from nonspace platforms or satellites
beyond the Thematic Mapper were not included but are planned to be incorporated
in future studies. In the present study no exhaustive attempt was
made to define data system requirements in the extractive and subsequent
processing steps.

Within the two basic constraints previously discussed, certain additional
specific assumptions were made in an effort to accurately size the data load.
The space platform orbit was taken as a Landsat-D Sun-synchronous one with a
repeat cycle of 16 to 18 days and a 185 km swath width. Sun synchronization was
chosen to avoid possible problems in classification due to a varying Sun angle.
The sensor would be a passive whisk broom Thematic Mapper type with a 30 m
instantaneous field of view, in all probability the spatial resolution limit in 1985
for use in crop estimation. Target areas of interest would include all land areas
containing crops of significance, with the frequency of coverage dependent on
such factors as crop types, biophases, and cloud cover. Engineering
specifications for the orbital and sensor characteristics are found in Table 1.

TABLE 1. THEMATIC MAPPER CHARACTERISTICS

Orbital Characteristics
  Altitude at Equator                 705.3 km
  Altitude at Pole                    723.2 km
  Velocity                            7.5027 km/sec
  Ground Trace Velocity               6.87 km/sec
  Period                              5932.82 sec
  Inclination                         98.2°
  Repeat Cycle                        16 days
  Orbits per Repeat Cycle             233
  Overlap at Equator                  7.125%
  Overlap versus Latitude (φ)         (100 - 92.875 cos φ)%

Sensor Characteristics
  Swath Width                         185 km
  Bands                               5 bands at 30 m, 1 band at 120 m
  Bits/Band                           8
  Pixels (High Resolution)            6167 x 6167
  Pixels (Low Resolution)             1542 x 1542
  Data Bits/Scene                     1540 Mbits
  Data Rate                           61.6 Mb/sec
  Data Rate (Including Calibration
    and Ancillary Data)               84 Mb/sec
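
Several Table 1 entries can be cross-checked from the others. The sketch below recomputes the pixel counts and the data bits per scene from the swath width, band resolutions, and bits per band, and evaluates the overlap expression, reading the latitude symbol as φ. The square 185 km scene and the rounding up of pixel counts are assumptions that happen to reproduce the tabulated values. The daily volume uses the 70 scenes per day baseline from the ground rules.

```python
import math

SWATH_KM = 185.0
BITS_PER_SAMPLE = 8
BANDS = [(30.0, 5), (120.0, 1)]  # (ground resolution in m, number of bands)
BASELINE_SCENES_PER_DAY = 70     # baseline preprocessing capacity from the text


def pixels_per_side(swath_km, resolution_m):
    """Pixels across a square scene, rounded up to cover the full swath."""
    return math.ceil(swath_km * 1000.0 / resolution_m)


def bits_per_scene():
    """Raw image bits in one scene, summed over all bands."""
    total = 0
    for resolution_m, band_count in BANDS:
        side = pixels_per_side(SWATH_KM, resolution_m)
        total += side * side * band_count * BITS_PER_SAMPLE
    return total


def overlap_percent(latitude_deg):
    """Sidelap between adjacent swaths: (100 - 92.875 cos(latitude)) percent."""
    return 100.0 - 92.875 * math.cos(math.radians(latitude_deg))


scene_bits = bits_per_scene()
daily_bits = BASELINE_SCENES_PER_DAY * scene_bits
print(pixels_per_side(SWATH_KM, 30.0), pixels_per_side(SWATH_KM, 120.0))
print(f"{scene_bits / 1e6:.0f} Mbits per scene, {daily_bits / 1e9:.1f} Gbits per day")
print(f"overlap at 45 degrees latitude: {overlap_percent(45.0):.1f}%")
```

The recomputed scene size agrees with the 1540 Mbits entry, which suggests that figure counts image samples only, with calibration and ancillary data carried in the higher of the two data rates.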
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The next major set of assumptions concerned the selection of the crops 
that would be the source of the remotely sensed data burdening a global crop 
production forecasting system. The criteria employed in the selection of 
representative crops required that they be limited to those significantly affecting 
the data load; that they be economically important; and that they be crops for 
which adequate statistical data were available to make a determination of yearly 
production, areas cultivated, and regions in which grown. It was found that wide 
variations existed in these statistical data, and that no single source could be 
consulted for all the desired information. Yearly production fluctuations, differ- 
ent reporting practices, and lack of data on different crops for the same time 
period made the task of selecting candidate crops extremely difficult. Seven 
crops were finally selected: wheat (winter and spring) , corn, rice, potatoes, 
sugar (beet. and cane), small grain (barley, oats, rye, millet), and soybeans. 
Wheat leads the list in economic importance, has large cultivated areas, and is 
the one crop for which the use of remotely sensed data for production forecasting 
has been demonstrated with some success. The next to be judged as satisfying 
the selection criteria was corn. The third crop, rice, offers some unique 
problems. It is usually planted in small fields, growing seasons often overlap 
resulting in a given region having fields with different biophases present
simultaneously, and it is commonly planted in areas experiencing greater cloud cover
than those for other crops. All these factors impose rather stringent data 
gathering requirements. In the course of the trade study, these three crops 
(wheat, corn, rice) were found to contribute 80 percent of the data system load 
and, therefore, were given the most thorough examination. The corresponding 
cultivated areas and producing countries used to bound the data load are
presented in Table 2 [7].

TABLE 2. AREA OF REGIONS USED TO BOUND CROPLAND
IN REPRESENTATIVE COUNTRIES

    Country        Wheat (km²)     Corn (km²)      Rice (km²)

    U.S.           2 075 168       2 908 159         309 124
    U.S.S.R.       6 841 899       1 299 892         336 745
    China          2 872 896       4 111 510       3 854 329
    India          1 538 475       1 178 227       2 674 000
    Australia        971 593          67 648          62 225
    Poland           330 138               0               0
    Venezuela              0               0         153 400
    Canada           936 767          50 875               0

Analysis Approach Used


Much of this trade study was performed using the Data Systems Dynamic 
Simulator (DSDS) developed by MSFC. Because of the complicated nature and 
the many interacting real-time variables comprising such a study, the DSDS 
was ideally suited and readily available for producing meaningful and timely 
results. The DSDS is a reconfigurable software simulation system consisting 
of 6 basic core models and 136 precoded data system element models which may 
be connected in whatever way required to simulate a particular data system of 
interest. They range from high level components of which satellites, ground 
stations, and operation control centers are examples, to more detailed data 
system elements such as sensors, transmitters, receivers, and timing units. 

Once a specific data system or subsystem is configured and exercised, the 
DSDS generates information reports on the system's performance, allowing the
system to be restructured if it does not function within specifications. For this
study, four models were used in constructing the data system: the Mission Model 
and Throughput Model were core models already part of the DSDS system; and 
the Crop Model and Cloud Model were added to the simulation system. To 
properly understand their functions, relationships to each other, and use in this 
study, individual descriptions are in order. 

For determining crop viewing times, the Mission Ephemeris Generator 
of the DSDS Mission Model was used to simulate a Landsat-D type heliosynchronous
skip orbit. It has a 9:30 a.m. local Sun time equatorial crossing, a 705.3 km
altitude at the equator, an inclination of 98.2 degrees, a 5932.82 second period,
and a 16 day repeat cycle. The spacecraft velocity was 7.5 km/sec. Within
the Mission Model, the ephemeris was updated 360 times per orbit, with the
platform nadir checked at each update for crop cell crossings. The crop cell
identification, target acquisition, and target loss times were then provided to 
the other models so that they might determine the crops in view, their biophases, 
the cloud conditions, and the data rates at this point in time. 
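
The orbit parameters quoted above can be checked against one another, and the time step implied by 360 ephemeris updates per orbit can be worked out directly. The following is a minimal sketch, assuming an 86 400 second day and using the ground trace velocity from Table 1. The function names are illustrative and are not part of the DSDS.

```python
PERIOD_SEC = 5932.82          # orbital period
REPEAT_DAYS = 16              # repeat cycle
UPDATES_PER_ORBIT = 360       # ephemeris updates per orbit
GROUND_SPEED_KM_S = 6.87      # ground trace velocity (Table 1)
SECONDS_PER_DAY = 86400.0     # assumed day length


def orbits_per_repeat_cycle():
    """Number of orbits completed in one repeat cycle."""
    return REPEAT_DAYS * SECONDS_PER_DAY / PERIOD_SEC


def update_interval_sec():
    """Time between successive ephemeris updates."""
    return PERIOD_SEC / UPDATES_PER_ORBIT


def ground_step_km():
    """Distance the nadir point moves along the ground trace per update."""
    return GROUND_SPEED_KM_S * update_interval_sec()


if __name__ == "__main__":
    print(f"orbits per repeat cycle: {orbits_per_repeat_cycle():.2f}")
    print(f"update interval:         {update_interval_sec():.2f} sec")
    print(f"ground step per update:  {ground_step_km():.0f} km")
```

The period and repeat cycle give very nearly 233 orbits per cycle, as listed in Table 1. Each update corresponds to roughly 16.5 sec of flight, or a little over 110 km along the ground trace.
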

The Crop Model contained all the information needed to simulate the crop
types growing, their locations, areal extent, and biophases on the Earth's
surface as a function of latitude. This information on crop producing areas and
crop calendars was primarily obtained from the Oxford Economic Atlas of the
World [8] and the World Atlas of Agriculture [7], supplemented with 1976 data
from the USDA. When insufficient crop calendar data existed for a particular
crop, calendar data for a similar crop were used. Also, because of the
uncertainty of making long range cropland usage predictions, no attempt was made
to project usage to 1985. The growing pattern for each grain crop was modeled
using four biophases: the period from planting to emerging, from emerging to
heading, from heading to ripening, and from ripening to harvest. In actual
practice such clear cut distinctions cannot always be made for all crops, at all
times, in all locations, but for the purposes of this simulation such was assumed.
To make the information contained in the Crop Model compatible with the Mission
Model, a cell structure which divided the Earth's surface into 10 368 cells of
2 1/2 by 2 1/2 degrees was used. Since the areas of these cells varied with latitude,
while the image data in an operational system would be acquired in a continuous
185 km wide swath, it was necessary to use a conversion factor between the
varying cell sizes and the constant Landsat scenes. As the Mission Model
acquired a crop cell, it relayed the acquisition and loss times to the Crop Model.
Based on these times and the cell's location on the Earth's surface, the Crop
Model determined the crops growing in this cell and their biophases, and called
the Cloud Model for the extent of cloud cover for the current time, date, and
cloud region in which the cell was located. Due to the lack of precise crop data
and to increase simulation efficiency, certain approximations were necessary.
For example, all cells for the same crop and latitude used the same crop
calendar.
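
The cell count and the need for a latitude-dependent conversion factor can be illustrated numerically. The sketch below is only an assumption about how such a factor might be formed: it treats the Earth as a sphere of radius 6371 km and takes the ratio of cell area to the area of a 185 by 185 km scene. The report does not give the conversion actually used in the Crop Model, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0          # mean radius, spherical Earth (assumption)
CELL_DEG = 2.5                    # cell size in latitude and longitude
SCENE_AREA_KM2 = 185.0 * 185.0    # one full scene


def cell_count():
    """Number of 2.5 x 2.5 degree cells covering the globe."""
    return int(360 / CELL_DEG) * int(180 / CELL_DEG)


def cell_area_km2(lat_south_deg):
    """Area of the cell whose southern edge lies at lat_south_deg."""
    lat1 = math.radians(lat_south_deg)
    lat2 = math.radians(lat_south_deg + CELL_DEG)
    dlon = math.radians(CELL_DEG)
    return EARTH_RADIUS_KM ** 2 * dlon * (math.sin(lat2) - math.sin(lat1))


def scenes_per_cell(lat_south_deg):
    """Illustrative conversion factor: cell area in units of full scenes."""
    return cell_area_km2(lat_south_deg) / SCENE_AREA_KM2


if __name__ == "__main__":
    print("cells:", cell_count())
    for lat in (0, 30, 45, 60):
        print(f"cell at {lat:2d} deg: {scenes_per_cell(lat):.2f} scene equivalents")
```

The grid of 144 by 72 cells reproduces the 10 368 cells quoted above. Under the spherical assumption, a cell at the equator covers a little more than two scene areas, and the factor falls steadily toward the poles.
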

The data used in the Cloud Model were prepared from statistics com- 
piled by Allied Research Associates [9]. Cloud cover statistics in this report 
were gathered from approximately 100 worldwide observation stations over a 
10 to 15 year period. Five cloud cover categories were defined for 30 climato- 
logical regions covering 80 geographic world areas. These categories were 0 
to 10 percent, 0 to 30 percent, 0 to 50 percent, 0 to 90 percent, and 0 to 100 
percent cloud cover. Because the statistical data in the ARA report were based
on the cloud coverage as seen by a ground observer with a 30 n. mi.
field-of-view versus satellite observation with a different field-of-view, the data had to
be scaled to the satellite scene size before incorporation into the Cloud Model.
Also, since the satellite was assumed to have a 9:30 a.m. equatorial crossing,
only the statistics for this time frame were used. In an attempt to validate the
scaled statistics employed in the Cloud Model, a comparison was made between 
them and cloud cover statistics contained in the Marshall Earth Resources 
Information Transfer System data base. Scaled Cloud Model statistics from 
May through September for the eastern half of the United States over a span of 
several years were compared to actual cloud cover percentages contained in 
previously acquired Landsat scenes for the same area and time. For 50 percent 
or less cloud cover, the difference between cloud cover in Landsat scenes and 
the scaled cloud cover statistics used in the simulation was approximately 
2 percent. 
The final model to be considered is the Throughput Model. It controlled
the sensor data rates, i.e., the amount of data generated for each crop target
based on the engineering specifications of the sensor, in this case a Thematic
Mapper producing a full 185 by 185 km scene every 24 sec. In addition, using
information supplied by the Mission, Crop, and Cloud Models, it monitored data
transit times, throughput rates, and processing delays, thus providing the
capability of generating simulation performance reports. The stage was now set for
pursuing the two principal goals of this trade study: (1) determining data loading,
and (2) investigating means of reducing the data loading through editing and
sampling strategies.


Results Achieved 

Using the DSDS to correlate the interrelated influences of orbital param- 
eters, crop calendars, and cloud conditions, comprehensive sets of global data 
loading profiles were generated. The effects on the data load of various crops 
and their biophases were investigated together with the effects of cloud cover 
and the number of satellites in orbit. Schemes for reducing the data load through 
cloud rejection editing and sampling strategies were also studied. All these 
analyses produced results far too extensive to be covered in their entirety in 
this report alone, and the reader is referred to General Electric Report 
77HV091 [10] for detailed discussions. 

The first phase of the study centered on the generation of day-to-day data 
loading profiles for all the major crops. The upper left hand chart in Figure 9 
shows the global viewing time in minutes per day from January to December for 
wheat. This data loading profile is for the possible viewing time of one satellite
regardless of cloud cover, i.e., for how many minutes per day one satellite would
sweep out a continuous 185 km swath on the Earth's arable surface containing
only wheat in any of four biophases. As can be seen from Figure 9, the peak
viewing time extends from early May to early November, the growing season for 
spring wheat in the Northern Hemisphere. Daily viewing times during this peak 
period range from a minimum of 17 min (42 scenes per day) to a maximum of 
34 min (or 85 scenes per day) . The daily vertical fluctuations in the graph are 
due to the areal distributions of the Earth's croplands as viewed from the
satellite. 
With the addition of corn, the average daily viewing time increases from
approximately 25 to 36 min per day, while for all crops the average daily viewing 
time is 64 min or 160 scenes, with a peak of 73 min or 183 scenes per day. A 
tabulation of Figure 9 is given in Table 3. 

TABLE 3. EFFECTS OF ADDITIONAL CROPS ON DATA LOADING

                           Average Daily Viewing Time     Scenes per Day
    Crops                  in Minutes for June Through    (185 km x 185 km
                           September                      per Scene)

    Wheat                            25                         62
    Wheat, Corn                      36                         90
    Wheat, Corn, Rice                47                        117
    All Crops                        64                        160
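
The relationship between viewing time and scene count in Table 3 follows from the Thematic Mapper producing one full scene every 24 sec. The sketch below converts minutes of viewing time to scenes and compares the result with the planned capacity of 70 scenes per day. Truncating to whole scenes is an assumption chosen because it matches the Table 3 entries; the report does not state its rounding rule, and the function names are illustrative.

```python
SCENE_SECONDS = 24             # one full scene every 24 sec (Throughput Model)
CAPACITY_SCENES_PER_DAY = 70   # planned processing capacity
MBITS_PER_SCENE = 1540         # data bits per scene (Table 1)


def scenes_from_minutes(minutes):
    """Whole scenes swept out during the given daily viewing time."""
    return int(minutes * 60 // SCENE_SECONDS)


def daily_gigabits(minutes):
    """Image data volume per day for the given viewing time."""
    return scenes_from_minutes(minutes) * MBITS_PER_SCENE / 1000.0


if __name__ == "__main__":
    table3_minutes = {
        "Wheat": 25,
        "Wheat, Corn": 36,
        "Wheat, Corn, Rice": 47,
        "All Crops": 64,
    }
    for crops, minutes in table3_minutes.items():
        scenes = scenes_from_minutes(minutes)
        status = "exceeds" if scenes > CAPACITY_SCENES_PER_DAY else "within"
        print(f"{crops:18s} {scenes:4d} scenes/day, "
              f"{daily_gigabits(minutes):6.1f} Gbits/day ({status} capacity)")
```

With these figures, only the wheat case stays within the 70 scenes per day (28 min) capacity for a single satellite.
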


Simulations to determine viewing times for multiple satellites were also 
conducted in an effort to see what effect this would have on the data load. For 
all crops and cloud conditions, two satellites with an 8 day separation gave a 
daily peak viewing time of 134 min, and an average of 131 min. Three satellites 
with a 5 day separation resulted in 205 min of peak viewing time and 196 min of 
average viewing time per day. It is evident from these results that processing 
full scenes from just one satellite for all crops far exceeds planned processing 
capacity (70 scenes per day or 28 min of viewing time) , without considering 
multiple satellite cases. The necessity for some type of editing is obvious. 

Figure 10 shows the combined effects of crop and cloud cover editing on 
the data load from one satellite. In the upper left hand chart, average through- 
put (scenes per day) for a 16 day period in June is plotted against cloud cover 
acceptance criteria for various combinations of crops. The horizontal dotted 
line represents the planned processing capacity of 70 full scenes per day. Taking 
the plot for wheat, corn, and rice, if all scenes with 30 percent or less cloud 
cover (cloud category two) are accepted, 60 scenes per day will need to be 
processed. If scenes with 50 percent or less cloud cover (cloud category three) 
are accepted, 75 scenes per day on the average remain to be processed, already 
exceeding the planned capability. The lower left hand chart shows the peak
processing requirement for the same time period. Here, if scenes containing
wheat, corn, and rice with 30 percent or less cloud cover are accepted, 77
scenes still remain to be processed for the worst daily case. All four charts
indicate that severe crop and cloud cover editing would have to be employed to
reduce the daily data load, but at the cost of losing scenes with usable information
content. A much more sophisticated approach is needed.

Figure 10. Effect of editing on data loading.


Sampling 

A sampling technique similar to that being used in the LACIE Program
was considered as the next most feasible approach for reducing the data volume 
while retaining scene information content. 

It has been stated previously that it is necessary to have three observa- 
tions of a particular sample segment during the growing season to accurately 
identify the crops. Since some of the samples will be under cloud cover during 
the subsequent passes, it is necessary to obtain data on many more samples on 
the first pass to have the required number at the end of the three growth phases. 
Calculations were made to determine the number of samples required for three 
crops in three countries to obtain some indication of the magnitude of the data 
required. These calculations were for corn in the U.S., wheat in Canada, and
rice in India. A brief description of the methods used follows. 

The proportion of each region being sampled is determined by the ratio, 
or proportion, of the area of the crop being inventoried to the total agricultural
area in the country. The figures for areas are based on historical data, and it 
is realized that the current crops may not be of exactly the same proportion, 
which is the reason the inventory is being done. However, the use of historical 
data to determine the number of samples required is acceptable sampling theory. 

A Bernoullian distribution is assumed to determine the number of samples
required. The region to be sampled is divided into a number of segments. If
a segment contains a larger proportion of the crop being inventoried than the
proportion for the entire region, it is assigned a value of 1. If it contains less than
the proportion for the entire region, it is assigned a value of 0. The proportion,
p, can then be calculated from the number of segments having values of 1 and the
total number of segments.
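
The 0/1 coding of segments described above can be sketched in a few lines. The segment crop fractions below are invented purely for illustration. Treating a segment exactly equal to the regional proportion as a 0 is also an assumption, since the text does not cover that case.

```python
def code_segments(segment_fractions, regional_proportion):
    """Assign 1 to segments above the regional crop proportion, else 0."""
    return [1 if f > regional_proportion else 0 for f in segment_fractions]


def proportion_of_ones(codes):
    """Proportion p: segments coded 1 divided by total segments."""
    return sum(codes) / len(codes)


if __name__ == "__main__":
    # Hypothetical crop fractions for eight segments (not from the report).
    fractions = [0.05, 0.30, 0.22, 0.10, 0.41, 0.00, 0.19, 0.25]
    codes = code_segments(fractions, regional_proportion=0.19)
    print("codes:", codes)
    print("p =", proportion_of_ones(codes))
```
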
In a Bernoullian distribution, the standard deviation, σ, is

    \sigma = \sqrt{n p (1 - p)}        (2)

and the mean, μ, is

    \mu = n p        (3)

where p is the proportion of the crop in the area being sampled and n is the
number of samples [11].

The relative sampling error, e, is

    e = \frac{\sigma}{\mu} = \frac{\sqrt{n p (1 - p)}}{n p} = \sqrt{\frac{1 - p}{n p}}        (4)


Since the Bernoullian distribution is only an approximation, the sampling
error determined by equation (4) will be an approximation. If the distribution
is normal, or Gaussian, the approximation will be quite good. The question is
how closely the samples from the Bernoullian distribution approach a Gaussian
distribution. It is a well known statistical result that if a large number of
samples are drawn from a non-Gaussian distribution, the distribution of the
sample mean will approach a Gaussian distribution. Generally, this will be the case
if the number of samples is greater than 30. It is thus assumed that for the
number of samples used, the distribution will be Gaussian.

The sample error determined by equation (4) is then based on a Gaussian
distribution and will be correct for a confidence level of 1σ (68.27 percent).

If the confidence level is to be greater than 1σ, the error in equation (4)
must be modified by a confidence multiplier, k. Equation (4) is rewritten as

    e = k \frac{\sigma}{\mu} = k \sqrt{\frac{1 - p}{n p}}        (5)

For a confidence of 68.27 percent, the multiplier is k = 1. For 90 percent
confidence, k = 1.645. The value of k for any confidence level can be determined
from a normal distribution curve or table [12].

To compute the number of samples required for a given allowable error
and confidence level, equation (5) can be rewritten as

    n = \frac{10^{4} k^{2} (1 - p)}{e^{2} p}        (6)

where e is in percentage.
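
As a worked illustration of the sample size relation, the sketch below takes n as 10^4 k^2 (1 - p) / (e^2 p), with e the allowable error in percent, and includes the inverse relation as a consistency check. It assumes that an accuracy of 90 percent corresponds to an allowable error of 10 percent, and the function names are illustrative.

```python
import math


def samples_required(p, error_percent, k=1.645):
    """Sample count n = 1e4 * k^2 * (1 - p) / (e^2 * p), e in percent."""
    return 1.0e4 * k ** 2 * (1.0 - p) / (error_percent ** 2 * p)


def relative_error_percent(p, n, k=1.645):
    """Relative sampling error in percent for n samples (inverse of the above)."""
    return 100.0 * k * math.sqrt((1.0 - p) / (n * p))


if __name__ == "__main__":
    p = 0.19  # proportion in crop, e.g. corn in the U.S.
    for accuracy in (90, 95, 98):
        error = 100 - accuracy  # assumed reading of "accuracy"
        n = samples_required(p, error)
        print(f"accuracy {accuracy}%: n = {n:,.0f} "
              f"(check: e = {relative_error_percent(p, n):.1f}%)")
```

For p = 0.19 and a 10 percent allowable error at 90 percent confidence, this gives roughly 1150 samples, within a few percent of the rounded figure the report tabulates for corn in the U.S. The report does not state the source of the small difference.
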

The numbers of samples required for the three crops and countries
previously mentioned were calculated using allowable accuracies of 90, 95,
and 98 percent with a confidence level of 90 percent. The areas in each country
and crop were obtained from Reference 7. The numbers of samples required are
presented in Table 4.

To have the required number of samples at the end of the three observations,
it is necessary to obtain more sample observations on the first two observation
periods. The initial number of samples required is influenced by the number of
observations possible and the effects of cloud cover, and can be determined by
dividing the number of samples required by the product of the probabilities that
an observation can be obtained, or

    \text{initial number of samples} = \frac{\text{required number of samples}}{(\text{probability})^{x}}        (7)

where x is the number of observations required of each sample segment. The
probability numbers are a function of the cloud cover and the amount of overlap
for each scene. The calculation of the probability numbers is complex, and the
method and probability numbers used are available in Reference 10.
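
Equation (7) can be exercised with a simple sketch. It assumes a single probability applies to each of the x observations, which is a simplification; the actual probability numbers are given in Reference 10 and are not reproduced here. The value 0.80 and the function names are purely illustrative.

```python
def initial_samples(required, probability, observations=3):
    """Equation (7): required samples divided by probability ** x."""
    return required / probability ** observations


def implied_probability(required, initial, observations=3):
    """Per-observation probability implied by a required/initial pair."""
    return (required / initial) ** (1.0 / observations)


if __name__ == "__main__":
    # Hypothetical per-observation probability of a usable look.
    print(f"initial samples: {initial_samples(1200, 0.80):.0f}")
    # Probability implied by the one-satellite entry for corn in Table 5.
    print(f"implied probability: {implied_probability(1200, 2374):.3f}")
```

Read this way, the Table 5 entry for corn with one satellite (1 200 required, 2 374 initial) implies a per-observation probability of about 0.80.
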
TABLE 4. NUMBER OF SAMPLES REQUIRED

                     Agricultural   Specific Crop   Proportion   Accuracy Required   Number of
    Crop/Country     Area           Area            in Crop      at 90% Confidence   Samples
                     (1000 ha)      (1000 ha)                    (percent)           Required (a)

    Corn/U.S.          178 736         34 225          0.19            90               1 200
                                                                       95               4 700
                                                                       98              30 000

    Wheat/Canada        41 845         11 340          0.27            90                 750
                                                                       95               3 000
                                                                       98              19 000

    Rice/India         147 823         35 815          0.24            90                 870
                                                                       95               3 500
                                                                       98              22 000

    a. Rounded to two significant figures




The number of samples required on the initial observation to obtain the 
required number were calculated for the three crops previously used. It was 
assumed that scenes having 50 percent or less cloud cover would be usable. 

The samples required for one, two, or three satellites are given in Table 5 for 
the allowable accuracies and confidence levels previously assumed. 

The previous calculations show a requirement for three Landsat-D type 
satellites each with a 16 day repeat cycle. The data would be relayed to the 
ground by the use of TDRSS. There is a problem with the use of TDRSS since
the zone of exclusion includes most of India, and parts of Pakistan, the U.S.S.R.,
and the People's Republic of China. The zone of exclusion is shown in Figure 11.
Eliminating this much of the world's agriculture would have a serious adverse
effect on the global crop forecast. Also, obtaining all the data required from the
satellite would require a dedicated TDRSS channel.

To relay the data from the TDRSS ground receiving station to the ground
processing facility will require a dedicated DOMSAT channel and will result in 
a significant cost of data transmission. 

A study was performed by General Electric [13-17] for GSFC which 
indicated a need for processing 480 scenes a day for an agricultural mission. 
This requirement far exceeds the planned capability. The results of the GE 
study are not repeated here. 

The conceptual data system, with data rates, is shown in Figure 12.


CONCLUSIONS AND RECOMMENDATIONS 


Based on the results of this investigation, it can be concluded that it is
not possible to meet the revised objective (as previously stated) with the pro- 
jected data systems available in the 1985 time frame. Some of the reasons are 
as follows: 
TABLE 5. INITIAL NUMBER OF SAMPLES NEEDED TO HAVE REQUIRED SAMPLES
AFTER 3 OBSERVATIONS (SHOWN FOR 1, 2, AND 3 SATELLITES)

                     Accuracy at                   Oversampling Required for Satellites
    Crop/Country     90% Confidence   Samples
                     (percent)        Required          1            2            3

    Corn/U.S.            90             1 200         2 374        1 396        1 245
                         95             4 700         9 297        5 468        4 877
                         98            30 000        59 346       34 903       31 129

    Wheat/Canada         90               750         1 029          785          756
                         95             3 000         4 115        3 140        3 026
                         98            19 000        26 059       19 884       19 163

    Rice/India           90               870         2 022        1 056          911
                         95             3 500         8 136        4 247        3 666
                         98            22 000        51 142       26 694       23 041

Figure 12. Block diagram of conceptual data system.









Data Collection 


Three satellites are required and these are not projected to be available. 

The peak scene processing requirement of 480 scenes per day is far 
beyond the planned capability. In fact, processing full scenes from only one 
satellite for all crops is beyond the capability of any planned system. 

Onboard processing would reduce the quantities of data being transmitted 
to the ground. For example, simple crop and cloud editing can significantly 
reduce the data load, but at the possible expense of losing valuable information 
content which would reduce the accuracy of the forecast. Scenes with 50 percent 
or less cloud cover appear to be usable. However, for all crops, only scenes 
with 30 percent or less cloud cover could be processed daily by the assumed 
system. 

A sampling approach was shown to reduce the data load to an acceptable 
level (under certain assumed conditions, the equivalent of 3.14 scenes versus
84.9 scenes for corn in the U.S.) while preserving information content.
However, oversampling must be used to reduce statistical error caused by cloud
conditions and viewing opportunities for different crops. 

The TDRSS must be upgraded to include a dedicated channel to handle 
the additional data requirements (three satellites and two TDRS's with only one
satellite transmitting at any given time). Also, the zone of exclusion must be
eliminated. 

DOMSAT must be utilized for ground to ground communication if the 10 
hr turnaround time is to be maintained. Location of the preprocessor close to 
the TDRSS ground station should result in a time and cost savings. 


Information Extraction 

A breakthrough is needed in crop identification if the processing require- 
ments are to be reduced, but this is not likely to occur. The use of multitem- 
poral data is a method to overcome the identification problem, but its use 
greatly increases the processing, and no other method of accurate identification 
is available or expected to be available by 1985. 
There is a lack of a historic data base for crop calendars and agricultural
practices outside the U.S. This data base should cover a minimum of 10 years.

A reduction of accuracy in the forecast would result in much less data 
processing. If the accuracy requirements were reduced from 98 percent to 
95 percent or 90 percent, the number of samples would be greatly reduced. 

This effect is shown in Table 4. With the samples reduced, the processing load 
would be greatly reduced. In fact, 95 percent accuracy on a worldwide basis 
appears acceptable. 
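
The size of this reduction can be estimated from the sample size relation, in which the number of samples varies inversely with the square of the allowable error for fixed p and k. As an illustrative calculation, assuming accuracies of 98, 95, and 90 percent correspond to allowable errors of 2, 5, and 10 percent:

```latex
\frac{n_{98}}{n_{95}} = \left(\frac{5}{2}\right)^{2} = 6.25,
\qquad
\frac{n_{98}}{n_{90}} = \left(\frac{10}{2}\right)^{2} = 25
```

The rounded entries in Table 4 are consistent with these ratios; for corn in the U.S., 30 000 / 4 700 is about 6.4 and 30 000 / 1 200 is 25.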


Additional Studies 

Figure 12 identifies a number of issues not yet resolved, indicating the 
need for additional trade studies to be performed. Some of these studies are: 

1. Determine the data systems costs associated with each satellite
configuration (1, 2, or 3) taking into account varying altitudes, swath widths,
spatial resolutions, orbital (skip or retrograde) periods, and sensor pointings.

2. Using precise definitions of windows, lengths of time, number of
samples by country or region, and make-up of samples (multipurpose or single
purpose), study various editing techniques to determine processing requirements
more accurately than was done in the present analysis.

3. Develop cost estimates for an agricultural ground processing system.
APPENDIX

LIST OF KEY INDIVIDUALS AND THEIR ROLE IN INFLUENCING
THE CONCEPT OF THE DATA SYSTEM

    No.  Organization                  Key Individual         "Data Systems" Role

    1    USDA/ARS/Weslaco, TX          Dr. Craig Wiegand      Influences requirements

    2    USDA/ARS/Weslaco, TX          Dr. Jerry Richardson   Influences requirements/uses data -
                                                              research mode

    3    USDA/ARS/Weslaco, TX          Mr. Paul Nixon         Influences requirements/uses data -
                                                              research mode

    4    USDA/ARS/Weslaco, TX          Mr. Ross Learner       Influences requirements

    5    USDA/ARS/Weslaco, TX          Mr. Joe Cuellar        Influences requirements - ground truth

    6    USDA/ARS/Akron, CO            Dr. Darryl Smika       Influences requirements/uses data -
                                                              yield models

    7    NASA/JSC/Houston, TX          Mr. Norm Foster        USDA requirements for various crops, and
                                                              system definitions for other USDA
                                                              requirements

    8    NASA/JSC/LACIE/Houston, TX    Mr. Wayne Eaton        LACIE project management

    9    USDA/JSC/LACIE/Houston, TX    Mr. James Murphy